Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation

Abstract

Self-supervised monocular depth estimation has received much attention recently in computer vision. Most existing works aggregate multi-scale features for depth prediction via either straightforward concatenation or element-wise addition; however, such feature aggregation operations generally neglect the contextual consistency between multi-scale features. To address this problem, we propose the Self-Distilled Feature Aggregation (SDFA) module, which simultaneously aggregates a pair of low-scale and high-scale features while maintaining their contextual consistency. The SDFA employs three branches to learn three feature offset maps: one offset map for refining the input low-scale feature, and the other two for refining the input high-scale feature under a designed self-distillation manner. We then propose an SDFA-based network for self-supervised monocular depth estimation and design a self-distilled training strategy to train the proposed network with the SDFA module. Experimental results on the KITTI dataset demonstrate that the proposed method outperforms the comparative state-of-the-art methods in most cases. The code is available at https://github.com/ZM-Zhou/SDFA-Net_pytorch.
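
The difference between plain aggregation and offset-based refinement can be illustrated with a small sketch. It is not the authors' implementation: the function names are hypothetical, the feature maps are single-channel nested lists, "low-scale" is assumed to mean lower spatial resolution, and the offsets are set by hand rather than learned by network branches. It shows concatenation and element-wise addition next to resampling a feature map at offset positions with bilinear interpolation.

```python
from typing import List

Grid = List[List[float]]  # one single-channel H x W feature map (illustrative only)


def upsample_nearest(feat: Grid, factor: int = 2) -> Grid:
    """Nearest-neighbour upsampling so a low-resolution map matches a larger one."""
    return [[v for v in row for _ in range(factor)]
            for row in feat for _ in range(factor)]


def aggregate_add(a: Grid, b: Grid) -> Grid:
    """Element-wise addition of two maps of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def aggregate_concat(a: Grid, b: Grid) -> List[Grid]:
    """Concatenation along a (hypothetical) channel axis: the maps are stacked."""
    return [a, b]


def warp_with_offsets(feat: Grid, dx: Grid, dy: Grid) -> Grid:
    """Resample feat at (x + dx, y + dy) using bilinear interpolation.

    Sampling positions are clamped to the border. In an offset-based module the
    offsets would be predicted by the network; here they are given by hand.
    """
    h, w = len(feat), len(feat[0])
    out: Grid = []
    for y in range(h):
        row = []
        for x in range(w):
            sx = min(max(x + dx[y][x], 0.0), w - 1.0)
            sy = min(max(y + dy[y][x], 0.0), h - 1.0)
            x0, y0 = int(sx), int(sy)
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = sx - x0, sy - y0
            top = feat[y0][x0] * (1 - fx) + feat[y0][x1] * fx
            bottom = feat[y1][x0] * (1 - fx) + feat[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# Toy inputs: a 2x2 low-resolution map and a 4x4 high-resolution map.
low = [[1.0, 2.0], [3.0, 4.0]]
high = [[0.5] * 4 for _ in range(4)]

low_up = upsample_nearest(low)                 # now 4x4
zeros = [[0.0] * 4 for _ in range(4)]
half = [[0.5] * 4 for _ in range(4)]

plain_sum = aggregate_add(low_up, high)        # no alignment between the maps
refined = warp_with_offsets(low_up, half, zeros)  # shift sampling by half a pixel
refined_sum = aggregate_add(refined, high)     # aggregate after refinement
```

With all offsets set to zero the warp returns the input unchanged, so the offset maps only matter where the two feature maps are misaligned.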

Similar resources

Self-Supervised Monocular Image Depth Learning and Confidence Estimation

Convolutional Neural Networks (CNNs) need large amounts of data with ground truth annotation, which is a challenging problem that has limited the development and fast deployment of CNNs for many computer vision tasks. We propose a novel framework for depth estimation from monocular images with corresponding confidence in a self-supervised manner. A fully differential patch-based cost function is...

Fusion of stereo and still monocular depth estimates in a self-supervised learning context

We study how autonomous robots can learn by themselves to improve their depth estimation capability. In particular, we investigate a self-supervised learning setup in which stereo vision depth estimates serve as targets for a convolutional neural network (CNN) that transforms a single still image to a dense depth map. After training, the stereo and mono estimates are fused with a novel fusion m...

Self-supervised Monocular Road Detection in Desert Terrain

We present a method for identifying drivable surfaces in difficult unpaved and offroad terrain conditions as encountered in the DARPA Grand Challenge robot race. Instead of relying on a static, pre-computed road appearance model, this method adjusts its model to changing environments. It achieves robustness by combining sensor information from a laser range finder, a pose estimation system and ...

Self-Supervised Siamese Learning on Stereo Image Pairs for Depth Estimation in Robotic Surgery

Robotic surgery has become a powerful tool for performing minimally invasive procedures, providing advantages in dexterity, precision, and 3D vision, over traditional surgery. One popular robotic system is the da Vinci surgical platform, which allows preoperative information to be incorporated into live procedures using Augmented Reality (AR). Scene depth estimation is a prerequisi...

Self-Supervised Depth Learning for Urban Scene Understanding

As an agent moves through the world, the apparent motion of scene elements is (usually) inversely proportional to their depth. It is natural for a learning agent to associate image patterns with the magnitude of their displacement over time: as the agent moves, far away mountains don’t move much; nearby trees move a lot. This natural relationship between the appearance of objects and their mot...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-19769-7_41